In cpprb, you can add multiple steps of environment values (transitions) to the replay buffer in a single call.
import numpy as np
from cpprb import ReplayBuffer
rb = ReplayBuffer(32, {"obs": {"shape": 5},
                       "act": {"shape": 3},
                       "rew": {},
                       "next_obs": {"shape": 5},
                       "done": {}})
steps = 10
rb.get_stored_size() # -> 0
# Shapes must be passed to np.ones / np.zeros as tuples: (steps, dim)
rb.add(obs=np.ones((steps, 5)),
       act=np.zeros((steps, 3)),
       rew=np.ones(steps),
       next_obs=np.ones((steps, 5)),
       done=np.zeros(steps))
rb.get_stored_size() # -> steps
The step dimension must be the 0th dimension of every array passed to add.
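If your data is laid out with the step dimension last, it can be moved to the front before being passed to the buffer. The following is a minimal sketch using NumPy only; the (feature, step) layout and the variable names are hypothetical and chosen purely for illustration.

```python
import numpy as np

steps = 10

# Hypothetical input whose layout is (feature, step) instead of (step, feature)
obs_feature_first = np.ones((5, steps))

# Move the last axis (step) to position 0 so that the step dimension comes first
obs_step_first = np.moveaxis(obs_feature_first, -1, 0)

print(obs_step_first.shape)
```

The resulting array has the step dimension as its 0th dimension and can be passed as a value such as obs.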
For each environment value, the shape used by add is stored in the constructor as add_shape=(-1,*env_shape), where env_shape is the shape of that value for a single step. Only one environment value is reshaped to add_shape to determine the step size, so the user must pass values that all have the same step size.
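To illustrate how a leading -1 lets the step size be inferred, here is a minimal sketch that uses plain NumPy. It mirrors the add_shape=(-1,*env_shape) convention described above but is not cpprb's internal code; the variable names are assumptions made for this example.

```python
import numpy as np

env_shape = (5,)               # shape of one "obs" value for a single step
add_shape = (-1, *env_shape)   # -1 asks NumPy to infer the number of steps

# Ten steps passed at once: the inferred step size is 10
obs = np.ones((10, 5))
steps = obs.reshape(add_shape).shape[0]

# A single step without a leading step dimension: the inferred step size is 1
single_obs = np.ones(5)
single_steps = single_obs.reshape(add_shape).shape[0]

print(steps, single_steps)
```

Because the step size is inferred from one value only, a mismatch in the other values is not detected at this point, which is why all values need the same step size.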